Search for: All records

Creators/Authors contains: "Wu, Yujie"

  1. The Alchemical Transfer Method (ATM) is herein validated against the relative binding free energies of a diverse set of protein-ligand complexes. We employed a streamlined setup workflow, a bespoke force field, and the AToM-OpenMM software to compute the relative binding free energies (RBFE) of the benchmark set prepared by Schindler and collaborators at Merck KGaA. This benchmark set includes examples of standard small R-group ligand modifications as well as more challenging scenarios, such as large R-group changes, scaffold hopping, formal charge changes, and charge-shifting transformations. The novel coordinate perturbation scheme and a dual-topology approach of ATM address some of the challenges of single-topology alchemical relative binding free energy methods. Specifically, ATM eliminates the need for splitting electrostatic and Lennard-Jones interactions, atom mapping, defining ligand regions, and post-corrections for charge-changing perturbations. Thus, ATM is simpler and more broadly applicable than conventional alchemical methods, especially for scaffold-hopping and charge-changing transformations. Here, we performed well over 500 relative binding free energy calculations for eight protein targets and found that ATM achieves accuracy comparable to existing state-of-the-art methods, albeit with larger statistical fluctuations. We discuss insights into specific strengths and weaknesses of the ATM method that will inform future deployments. This study confirms that ATM is applicable as a production tool for relative binding free energy (RBFE) predictions across a wide range of perturbation types within a unified, open-source framework. 
    Free, publicly-accessible full text available August 16, 2024
  2. Key Points: A method to concoct non-stationary data series is proposed. Eddy covariance and wavelet analysis methods underestimate turbulent momentum flux under non-stationary conditions by about 50%. The Mexican hat wavelet method has the potential to accurately calculate the flux of non-stationary turbulence after correction.
  3. Wren, Jonathan (Ed.)
    Abstract. Motivation: In the training of predictive models using high-dimensional genomic data, multiple studies’ worth of data are often combined to increase sample size and improve generalizability. A drawback of this approach is that there may be different sets of features measured in each study due to variations in expression measurement platform or technology. It is common practice to work only with the intersection of features measured in common across all studies, which results in the blind discarding of potentially useful feature information that is measured in individual studies or subsets of studies. Results: We characterize the loss in predictive performance incurred by using only the intersection of feature information available across all studies when training predictors using gene expression data from microarray and sequencing datasets. We study the properties of linear and polynomial regression for imputing discarded features and demonstrate improvements in the external performance of prediction functions through simulation and in gene expression data collected on breast cancer patients. To improve this process, we propose a pairwise strategy that applies any imputation algorithm to two studies at a time and averages imputed features across pairs. We demonstrate that the pairwise strategy is preferable to first merging all datasets together and imputing any resulting missing features. Finally, we provide insights on which subsets of intersected and study-specific features should be used so that missing-feature imputation best promotes cross-study replicability. Availability and implementation: The code is available at https://github.com/YujieWuu/Pairwise_imputation. Supplementary information: Supplementary information is available at Bioinformatics online.
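The pairwise strategy described in the third abstract (impute with two studies at a time, then average the imputed features across pairs) can be sketched as follows. This is a minimal illustration and not the authors' implementation, which is available at the GitHub link above. The function name `pairwise_impute`, the dictionary layout of a study (feature name mapped to a 1-D array of samples), and the choice of ordinary least squares as the imputation algorithm are all assumptions made for this example.

```python
import numpy as np


def pairwise_impute(target, donors, feature):
    """Impute `feature` in `target` by pairing it with one donor study at a time.

    Hypothetical helper: each study is a dict mapping feature names to 1-D
    NumPy arrays of equal length within that study.
    """
    predictions = []
    for donor in donors:
        if feature not in donor:
            continue  # this donor cannot inform the missing feature
        # Features measured in both studies of the pair (excluding the one to impute).
        shared = sorted((set(target) & set(donor)) - {feature})
        if not shared:
            continue
        # Fit a linear imputation model (with intercept) inside the donor study.
        X_donor = np.column_stack([donor[g] for g in shared])
        A = np.column_stack([np.ones(X_donor.shape[0]), X_donor])
        coef, *_ = np.linalg.lstsq(A, donor[feature], rcond=None)
        # Apply the fitted model to the target study's shared features.
        X_target = np.column_stack([target[g] for g in shared])
        B = np.column_stack([np.ones(X_target.shape[0]), X_target])
        predictions.append(B @ coef)
    if not predictions:
        raise ValueError("no donor study measures the requested feature")
    # Average the imputed values across all (target, donor) pairs.
    return np.mean(predictions, axis=0)


# Toy data (made up): g3 = 2 * g1 + 1 holds exactly in both donor studies.
donor_a = {"g1": np.array([0.0, 1.0, 2.0, 3.0]),
           "g3": np.array([1.0, 3.0, 5.0, 7.0])}
donor_b = {"g1": np.array([0.0, 1.0, 2.0, 3.0, 4.0]),
           "g2": np.array([1.0, 0.0, 2.0, 1.0, 5.0]),
           "g3": np.array([1.0, 3.0, 5.0, 7.0, 9.0])}
target = {"g1": np.array([1.0, 2.0, 3.0]),
          "g2": np.array([0.0, 1.0, 0.0])}

imputed_g3 = pairwise_impute(target, [donor_a, donor_b], "g3")
```

Each pair uses its own set of shared features, so a donor that overlaps with the target on more features than the full intersection across studies can still contribute that extra information. Any other imputation algorithm could be substituted for the least-squares fit.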